This paper presents a video summarization method designed specifically for static summaries of consumer videos. Because consumer videos usually have unclear shot boundaries and many low-quality or meaningless frames, we propose a two-step approach: the first step skims the video, and the second step performs content-aware clustering with keyframe selection. Specifically, the first step removes most of the redundant frames, which contain little new information, by applying spectral clustering to color histogram features. The result is a condensed video that is shorter and has clearer temporal boundaries than the original. In the second step, we perform rough temporal segmentation and then apply refined clustering within each temporal segment, where each frame is represented by the sparse coding of its SIFT features. A keyframe is selected from each cluster based on measures of representativeness and visual quality, where representativeness is defined from the sparse coding and visual quality combines contrast, blur, and image skew measures. Keyframe selection, that is, finding frames that are both representative and of high quality, is formulated as an optimization problem. Experiments on videos of various lengths show that the resulting summaries closely follow the important content of the videos.
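
The keyframe selection idea can be illustrated with a minimal sketch. The abstract does not give the actual optimization formulation, so the version below is only an assumption: it scores each frame by a weighted sum of its representativeness and visual quality and keeps the best-scoring frame in each cluster. The function name `select_keyframes`, the trade-off weight `alpha`, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Sequence


def select_keyframes(
    cluster_ids: Sequence[int],
    representativeness: Sequence[float],
    quality: Sequence[float],
    alpha: float = 0.5,
) -> Dict[int, int]:
    """Return {cluster id: index of the chosen frame}.

    Assumption: the score is a simple weighted sum. `alpha` is a
    hypothetical trade-off weight, not a parameter from the paper.
    """
    if not (len(cluster_ids) == len(representativeness) == len(quality)):
        raise ValueError("all inputs must have one entry per frame")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    best_index: Dict[int, int] = {}
    best_score: Dict[int, float] = {}
    for idx, cid in enumerate(cluster_ids):
        # Combine the two per-frame measures into one score.
        score = alpha * representativeness[idx] + (1.0 - alpha) * quality[idx]
        # Keep the highest-scoring frame seen so far in this cluster;
        # ties keep the earliest frame.
        if cid not in best_index or score > best_score[cid]:
            best_index[cid] = idx
            best_score[cid] = score
    return best_index


# Made-up scores for five frames in two clusters.
clusters = [0, 0, 0, 1, 1]
rep = [0.9, 0.6, 0.2, 0.5, 0.7]
qual = [0.1, 0.8, 0.9, 0.9, 0.4]
keyframes = select_keyframes(clusters, rep, qual, alpha=0.5)
print(keyframes)
```

In this toy input, frame 0 is the most representative frame of cluster 0 but has poor quality, so a balanced weight prefers frame 1 instead. Setting `alpha=1.0` would ignore quality entirely and select by representativeness alone.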